A DNA CONSTRUCT ENCODING THE YAP3 SIGNAL PEPTIDE

FIELD OF INVENTION

The present invention relates to a DNA construct comprising the
YAP3 signal peptide for secretion of a heterologous
polypeptide, a yeast cell containing the DNA construct and a
method of producing heterologous polypeptides in yeast from the
DNA construct.

BACKGROUND OF THE INVENTION 

Yeast organisms produce a number of proteins which are
synthesized intracellularly, but which have a function outside
the cell. Such extracellular proteins are referred to as
secreted proteins. These secreted proteins are expressed
initially inside the cell in a precursor or a pre-protein form
containing a presequence ensuring effective direction of the
expressed product across the membrane of the endoplasmic
reticulum (ER). The presequence, normally named a signal
peptide, is cleaved off from the rest of the protein during
translocation. Once it has entered the secretory pathway, the
protein is transported to the Golgi apparatus. From the Golgi
the protein can follow different routes that lead to
compartments such as the cell vacuole or the cell membrane, or
it can be routed out of the cell to be secreted to the external
medium (Pfeffer, S.R. and Rothman, J.E. Ann. Rev. Biochem. 56
(1987), 829-852).

Several approaches have been suggested for the expression and
secretion in yeast of proteins heterologous to yeast. European
published patent application No. 88 632 describes a process by
which proteins heterologous to yeast are expressed, processed
and secreted by transforming a yeast organism with an
expression vehicle harbouring DNA encoding the desired protein
and a signal peptide, preparing a culture of the transformed
organism, growing the culture and recovering the protein from
the culture medium. The signal peptide may be the signal
peptide of the desired protein itself, a heterologous signal
peptide or a hybrid of native and heterologous signal peptide.

A problem encountered with the use of signal peptides
heterologous to yeast might be that the heterologous signal peptide
does not ensure efficient translocation and/or cleavage after
the signal peptide.

The S. cerevisiae MFα1 (α-factor) is synthesized as a prepro
form of 165 amino acids comprising a signal or prepeptide of 19
amino acids followed by a "leader" or propeptide of 64 amino
acids, encompassing three N-linked glycosylation sites, followed
by (LysArg(Asp/Glu, Ala)2-3 α-factor)4 (Kurjan, J. and Herskowitz,
I. Cell 30 (1982), 933-943). The signal-leader part of the
preproMFα1 has been widely employed to obtain synthesis and
secretion of heterologous proteins in S. cerevisiae.

Use of signal/leader peptides homologous to yeast is known from
i.a. US patent specification No. 4,546,082, European published
patent applications Nos. 116 201, 123 294, 123 544, 163 529,
and 123 289 and DK patent application No. 3614/83.

In EP 123 289 utilization of the S. cerevisiae α-factor
precursor is described whereas WO 84/01153 indicates utilization
of the Saccharomyces cerevisiae invertase signal peptide and DK
3614/83 utilization of the Saccharomyces cerevisiae PHO5 signal
peptide for secretion of foreign proteins.

US patent specification No. 4,546,082, EP 116 201, 123 294, 123
544, and 163 529 describe processes by which the α-factor
signal-leader from Saccharomyces cerevisiae (MFα1 or MFα2) is
utilized in the secretion process of expressed heterologous
proteins in yeast. By fusing a DNA sequence encoding the
S. cerevisiae MFα1 signal/leader sequence at the 5' end of the
gene for the desired protein, secretion and processing of the
desired protein was demonstrated.

A number of secreted proteins are routed so as to be exposed to
a proteolytic processing system which can cleave the peptide
bond at the carboxy end of two consecutive basic amino acids.
This enzymatic activity is in S. cerevisiae encoded by the KEX
2 gene (Julius, D.A. et al., Cell 37 (1984b), 1075). Processing
of the product by the KEX 2 gene product is needed for the
secretion of active S. cerevisiae mating factor α (MFα or
α-factor) but is not involved in the secretion of active S.
cerevisiae mating factor a.

The use of the mouse salivary amylase signal peptide (or a
mutant thereof) to provide secretion of heterologous proteins
expressed in yeast has been described in WO 89/02463 and WO
90/10075. It is the object of the present invention to provide
a more efficient expression and/or secretion in yeast of
heterologous proteins.

SUMMARY OF THE INVENTION 

It has surprisingly been found that the signal peptide of the
yeast aspartic protease 3 is capable of providing improved
secretion of proteins expressed in yeast compared to the mouse
salivary amylase signal peptide.

Accordingly, the present invention relates to a DNA construct
comprising the following sequence

5'-P-SP-(LP)n-PS-HP-3'

wherein

P is a promoter sequence,

SP is a DNA sequence encoding the yeast aspartic protease 3
(YAP3) signal peptide,

LP is a DNA sequence encoding a leader peptide,

n is 0 or 1,

PS is a DNA sequence encoding a peptide defining a yeast
processing site, and

HP is a DNA sequence encoding a polypeptide which is
heterologous to a selected host organism.
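
The element order defined above can be written out as a small sketch. The function names are hypothetical and chosen only for illustration; the sketch models nothing beyond the ordering of the five elements and the fact that n is restricted to 0 or 1.

```python
def construct_layout(n: int) -> list[str]:
    """Return the 5'->3' order of elements for a given n (0 or 1)."""
    if n not in (0, 1):
        raise ValueError("n must be 0 or 1")
    # The leader peptide sequence LP is present only when n == 1.
    return ["P", "SP"] + ["LP"] * n + ["PS", "HP"]


def describe(n: int) -> str:
    """Format the layout in the 5'-...-3' notation used in the text."""
    return "5'-" + "-".join(construct_layout(n)) + "-3'"


print(describe(1))  # layout with a leader peptide
print(describe(0))  # layout without a leader peptide
```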

The term "signal peptide" is understood to mean a presequence
which is predominantly hydrophobic in nature and present as an
N-terminal sequence of the precursor form of an extracellular
protein expressed in yeast. The function of the signal peptide
is to allow the heterologous protein that is to be secreted to
enter the endoplasmic reticulum. The signal peptide is cleaved
off in the course of this process. The YAP3 signal sequence has
been reported previously, fused to its native gene (cf. M.
Egel-Mitani et al., Yeast, 1990, pp. 127-137). A DNA construct
wherein the YAP3 signal sequence is fused to a DNA sequence
encoding a heterologous polypeptide is believed to be novel.
The YAP3 signal peptide has not previously been reported to
provide efficient secretion of heterologous polypeptides in
yeast.

In the present context, the expression "leader peptide" is
understood to indicate a peptide whose function is to allow the
heterologous polypeptide to be directed from the endoplasmic
reticulum to the Golgi apparatus and further to a secretory
vesicle for secretion into the medium (i.e. export of the
expressed polypeptide across the cell wall or at least through
the cellular membrane into the periplasmic space of the cell).

The expression "heterologous polypeptide" is intended to 
indicate a polypeptide which is not produced by the host yeast 
organism in nature. 

In another aspect, the present invention relates to a
recombinant expression vector comprising the DNA construct of
the invention.

In a further aspect, the present invention relates to a cell
transformed with the recombinant expression vector of the
invention.

In a still further aspect, the present invention relates to a
method of producing a heterologous polypeptide, the method
comprising culturing a cell which is capable of expressing a
heterologous polypeptide and which is transformed with a DNA
construct of the invention in a suitable medium to obtain
expression and secretion of the heterologous polypeptide, after
which the heterologous polypeptide is recovered from the
medium.



DETAILED DESCRIPTION OF THE INVENTION 

In a specific embodiment, the YAP3 signal peptide is encoded by
the following DNA sequence

ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA
TCT CAG GTC CTT GGC (SEQ ID No:1)

or a suitable modification thereof encoding a peptide with a
high degree of homology (at least 60%, more preferably at least
70%, sequence identity) to the YAP3 signal peptide. Examples of
"suitable modifications" are nucleotide substitutions which do
not give rise to another amino acid sequence of the peptide,
but which may correspond to the codon usage of the yeast
organism into which the DNA sequence is introduced, or
nucleotide substitutions which do give rise to a different
amino acid sequence of the peptide (although the amino acid
sequence should not be modified to the extent that it is no
longer able to function as a signal peptide). Other examples of
possible modifications are insertion of three or multiples of
three nucleotides at either end of or within the sequence, or
deletion of three or multiples of three nucleotides at either
end of or within the sequence.
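
As an illustration of what a silent versus a non-silent substitution means here, the following sketch translates SEQ ID No:1 with the standard genetic code and compares two variants against it. The two variants are hypothetical and made up for this example. The text does not say how sequence identity is to be computed; the position-wise, ungapped measure below is an assumption that only works for peptides of equal length.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID No:1 as printed in the text (63 nucleotides, 21 codons).
YAP3_SP_DNA = (
    "ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA "
    "TCT CAG GTC CTT GGC"
)


def translate(dna: str) -> str:
    """Translate a coding sequence (spaces allowed) into one-letter amino acids."""
    seq = dna.replace(" ", "").upper()
    if len(seq) % 3:
        raise ValueError("length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def percent_identity(a: str, b: str) -> float:
    """Position-wise identity of two equal-length peptides (no gaps; an assumption)."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length peptides")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


yap3_sp = translate(YAP3_SP_DNA)

# Hypothetical silent variant: codon 3 changed from CTG to TTG (both Leu).
silent_variant = YAP3_SP_DNA.replace("ATG AAA CTG", "ATG AAA TTG", 1)
# Hypothetical non-silent variant: last codon changed from GGC (Gly) to GCC (Ala).
changed_variant = YAP3_SP_DNA[:-3] + "GCC"

print(yap3_sp)
print(percent_identity(yap3_sp, translate(silent_variant)))
print(percent_identity(yap3_sp, translate(changed_variant)))
```

Insertions or deletions of whole codons, which the text also allows, would need a gapped alignment and are outside the scope of this sketch.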
In the sequence 5'-P-SP-(LP)n-PS-HP-3', n is preferably 1. In
other words, although the YAP3 signal peptide may, in some
instances, in itself provide secretion and/or processing of the
heterologous polypeptide, a leader or pro-peptide sequence is
preferably present. The leader may be a yeast MFα1 leader
peptide or a synthetic leader peptide, e.g. one of the leader
peptides disclosed in WO 89/02463 or WO 92/11378 or a
derivative thereof capable of effecting secretion of a
heterologous polypeptide in yeast. The term "synthetic" is
intended to indicate that the leader peptides in question are
not found in nature. Synthetic yeast leader peptides may, for
instance, be constructed according to the procedures described
in WO 89/02463 or WO 92/11378.

The yeast processing site encoded by the DNA sequence PS may
suitably be any paired combination of Lys and Arg, such as
Lys-Arg, Arg-Lys, Lys-Lys or Arg-Arg, which permits processing of
the heterologous polypeptide by the KEX2 protease of
Saccharomyces cerevisiae or the equivalent protease in other
yeast species (D.A. Julius et al., Cell 37, 1984, 1075 ff.). If
KEX2 processing is not convenient, e.g. if it would lead to
cleavage of the polypeptide product, a processing site for
another protease may be selected instead comprising an amino
acid combination which is not found in the polypeptide product,
e.g. the processing site for FXa, Ile-Glu-Gly-Arg (cf. Sambrook,
Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor, New York, 1989).
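
Whether KEX2 processing "would lead to cleavage of the polypeptide product" can be screened for, at the sequence level only, by looking for paired Lys/Arg residues inside the product. The sketch below does just that; the function names and the example sequence are hypothetical, and a hit only flags a candidate site, since whether a given pair is actually cleaved depends on factors the text does not describe.

```python
import re

# Any pair of adjacent Lys (K) / Arg (R) residues. A lookahead is used so that
# overlapping pairs (e.g. the two pairs in "KRR") are each reported.
DIBASIC = re.compile(r"(?=([KR][KR]))")


def dibasic_sites(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, pair) for each Lys/Arg pair in the sequence."""
    return [(m.start() + 1, m.group(1)) for m in DIBASIC.finditer(protein.upper())]


def contains_fxa_site(protein: str) -> bool:
    """True if the Ile-Glu-Gly-Arg motif named in the text occurs in the sequence."""
    return "IEGR" in protein.upper()


# Hypothetical product sequence, made up purely for illustration.
example_product = "AKRGGRKTTKKPRR"

for position, pair in dibasic_sites(example_product):
    print(position, pair)
print(contains_fxa_site(example_product))
```

A product with no hits from `dibasic_sites` is a candidate for a Lys/Arg processing site; one that has internal pairs but no `IEGR` motif might instead be paired with the FXa site mentioned above.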

The heterologous protein produced by the method of the
invention may be any protein which may advantageously be produced in
yeast. Examples of such proteins are aprotinin, tissue factor
pathway inhibitor or other protease inhibitors, insulin or
insulin precursors, human or bovine growth hormone,
interleukin, glucagon, tissue plasminogen activator,
transforming growth factor α or β, platelet-derived growth
factor, enzymes, or a functional analogue thereof. In the
present context, the term "functional analogue" is meant to
indicate a polypeptide with a similar function as the native
protein (this is intended to be understood as relating to the
nature rather than the level of biological activity of the
native protein). The polypeptide may be structurally similar to
the native protein and may be derived from the native protein
by addition of one or more amino acids to either or both the C-
and N-terminal end of the native protein, substitution of one
or more amino acids at one or a number of different sites in
the native amino acid sequence, deletion of one or more amino
acids at either or both ends of the native protein or at one or
several sites in the amino acid sequence, or insertion of one
or more amino acids at one or more sites in the native amino
acid sequence. Such modifications are well known for several of
the proteins mentioned above.

The DNA construct of the invention may be prepared
synthetically by established standard methods, e.g. the
phosphoramidite method described by S.L. Beaucage and M.H.
Caruthers, Tetrahedron Letters 22, 1981, pp. 1859-1869, or the
method described by Matthes et al., EMBO Journal 3, 1984, pp.
801-805. According to the phosphoramidite method,
oligonucleotides are synthesized, e.g. in an automatic DNA
synthesizer, purified, annealed, ligated and cloned into the
yeast expression vector. It should be noted that the sequence
5'-P-SP-(LP)n-PS-HP-3' need not be prepared in a single
operation, but may be assembled from two or more
oligonucleotides prepared synthetically in this fashion.

One or more parts of the DNA sequence 5'-P-SP-(LP)n-PS-HP-3' may
also be of genomic or cDNA origin, for instance obtained by
preparing a genomic or cDNA library and screening for DNA
sequences coding for said parts (typically HP) by hybridization
using synthetic oligonucleotide probes in accordance with
standard techniques (cf. Sambrook, Fritsch and Maniatis,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New
York, 1989). In this case, a genomic or cDNA sequence encoding
a signal peptide may be joined to a genomic or cDNA sequence
encoding the heterologous protein, after which the DNA sequence
may be modified by the insertion of synthetic oligonucleotides
encoding the sequence 5'-P-SP-(LP)n-PS-HP-3' in accordance with
well-known procedures.

Finally, the DNA sequence 5'-P-SP-(LP)n-PS-HP-3' may be of mixed
synthetic and genomic, mixed synthetic and cDNA or mixed
genomic and cDNA origin prepared by annealing fragments of
synthetic, genomic or cDNA origin (as appropriate), the
fragments corresponding to various parts of the entire DNA
sequence, in accordance with standard techniques. Thus, it may
be envisaged that the DNA sequence encoding the signal peptide
or the heterologous polypeptide may be of genomic or cDNA
origin, while the sequence 5'-P-SP-(LP)n-PS may be prepared
synthetically.

The recombinant expression vector carrying the sequence
5'-P-SP-(LP)n-PS-HP-3' may be any vector which is capable of
replicating in yeast organisms. In the vector, the promoter
sequence (P) may be any DNA sequence which shows
transcriptional activity in yeast and may be derived from genes
encoding proteins either homologous or heterologous to yeast.
The promoter is preferably derived from a gene encoding a
protein homologous to yeast. Examples of suitable promoters are
the Saccharomyces cerevisiae MFα1, TPI, ADH I, ADH II or PGK
promoters, or corresponding promoters from other yeast species,
e.g. Schizosaccharomyces pombe. Examples of suitable promoters
are described by, for instance, Russell and Hall, J. Biol.
Chem. 258, 1983, pp. 143-149; Russell, Nature 301, 1983, pp.
167-169; Ammerer, Meth. Enzymol. 101, 1983, pp. 192-201;
Russell et al., J. Biol. Chem. 258, 1983, pp. 2674-2682;
Hitzeman et al., J. Biol. Chem. 255, 1980, pp. 12073-12080;
Kawasaki and Fraenkel, Biochem. Biophys. Res. Comm. 108, 1982,
and T. Alber and G. Kawasaki, J. Mol. Appl. Genet. 1, 1982, pp.
419-434.

The sequences indicated above should also be operably connected
to a suitable terminator, e.g. the TPI terminator (cf. T. Alber
and G. Kawasaki, J. Mol. Appl. Genet. 1, 1982, pp. 419-434), or
the yeast CYC1 terminator.

The recombinant expression vector of the invention further
comprises a DNA sequence enabling the vector to replicate in
yeast. Examples of such sequences are the yeast plasmid 2μ
replication genes REP 1-3 and origin of replication. The vector
may also comprise a selectable marker, e.g. the
Schizosaccharomyces pombe TPI gene as described by P.R. Russell, Gene
40, 1985, pp. 125-130, or the yeast URA3 gene.

The procedures used to insert the sequence
5'-P-SP-(LP)n-PS-HP-3' into a suitable yeast vector containing the
information necessary for yeast replication are well known to persons
skilled in the art (cf., for instance, Sambrook, Fritsch and
Maniatis, op. cit.). It will be understood that the vector may
be constructed either by first preparing a DNA construct
containing the entire sequence and subsequently inserting this
fragment into a suitable expression vector, or by sequentially
inserting DNA fragments containing genetic information for the
individual elements (such as the promoter sequence, the signal
sequence, the leader sequence, or DNA coding for the
heterologous polypeptide) followed by ligation.

The yeast organism transformed with the vector of the invention
may be any suitable yeast organism which, on cultivation,
produces large amounts of the heterologous polypeptide in
question. Examples of suitable yeast organisms may be strains
of Saccharomyces, such as Saccharomyces cerevisiae,
Saccharomyces kluyveri, or Saccharomyces uvarum,
Schizosaccharomyces, such as Schizosaccharomyces pombe,
Kluyveromyces, such as Kluyveromyces lactis, Yarrowia, such as
Yarrowia lipolytica, or Hansenula, such as Hansenula
polymorpha. The transformation of the yeast cells may for
instance be effected by protoplast formation followed by
transformation in a manner known per se.

The medium used to cultivate the cells may be any conventional
medium suitable for growing yeast organisms. The secreted
heterologous protein, a significant proportion of which will be
present in the medium in correctly processed form, may be
recovered from the medium by conventional procedures including
separating the yeast cells from the medium by centrifugation or
filtration, precipitating the proteinaceous components of the
supernatant or filtrate by means of a salt, e.g. ammonium
sulphate, followed by purification by a variety of
chromatographic procedures, e.g. ion exchange chromatography,
affinity chromatography, or the like.

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is further described in the following examples
with reference to the appended drawings wherein

Fig. 1A and 1B schematically show the construction of plasmid
pLaC257;

Fig. 2 shows the DNA sequence and derived amino acid sequence
of the EcoRI-XbaI insert in pLaC257 (SEQ ID No:2);

Fig. 3A and 3B schematically show the construction of plasmid
pLaC242-Apr;

Fig. 4 shows the DNA sequence and derived amino acid sequence
of the EcoRI-XbaI fragment of pAPR-Sc1, wherein the protein
sequence shown in italics is derived from the random expression
cloned DNA fragment (SEQ ID No:4);

Fig. 5 schematically shows the construction of plasmid pLaC263;

Fig. 6 shows the DNA sequence and derived amino acid sequence
of the EcoRI-XbaI fragment of pLaC263 (SEQ ID No:6);

Fig. 7A and 7B show the DNA sequence and derived amino acid
sequence of human tissue factor pathway inhibitor (TFPI)
including its native signal peptide (SEQ ID No:8);

Fig. 8A shows the DNA sequence and derived amino acid sequence
of the spx3 signal peptide and 212 leader peptide (shown in WO
89/02463) N-terminally fused to the TFPI sequence in plasmid
pYES-212TFPI161-117Q (SEQ ID No:10);

Fig. 8B shows the DNA sequence and derived amino acid sequence
of the YAP3 signal peptide and 212 leader peptide N-terminally
fused to the TFPI sequence in plasmid pYES-ykTFPI161-117Q
(SEQ ID No:12); and

Fig. 9 shows restriction maps of plasmids pYES21,
pP-212TFPI161-117Q, pYES-212TFPI161-117Q and pYES-ykTFPI161-117Q.

The invention is further illustrated in the following examples
which are not in any way intended to limit the scope of the
invention as claimed.

EXAMPLES 

Plasmids and DNA materials 

All expression plasmids contain 2μ DNA sequences for
replication in yeast and use either the S. cerevisiae URA3 gene
or the Schizosaccharomyces pombe triose phosphate isomerase
gene (POT) as selectable markers in yeast. POT plasmids are
described in EP patent application No. 171 142. A plasmid
containing the POT gene is available from a deposited E. coli
strain (ATCC 39685). The POT plasmids furthermore contain the
S. cerevisiae triose phosphate isomerase promoter and
terminator (PTPI and TTPI). They are identical to pMT742 (M.
Egel-Mitani et al., Gene 73, 1988, pp. 113-120) (see Fig. 1) except
for the region defined by the SphI-XbaI restriction sites
encompassing the PTPI and the coding region for
signal/leader/product. The URA3 plasmids use PTPI and the
iso-1-cytochrome C terminator (TCYC1).

The PTPI has been modified with respect to the sequence found in
pMT742, only in order to facilitate construction work. An
internal SphI restriction site has been eliminated by SphI
cleavage, removal of single-stranded tails and religation.
Furthermore, DNA sequences upstream of, and without any impact
on, the promoter have been removed by Bal31 exonuclease
treatment followed by addition of an SphI restriction site
linker. This promoter construction, present on a 373 bp
SphI-EcoRI fragment, is designated PTPIδ, and when used in plasmids
already described this promoter modification is indicated by
the addition of a δ to the plasmid name.

Finally, a number of synthetic DNA fragments have been employed,
all of which were synthesized on an automatic DNA synthesizer
(Applied Biosystems model 380A) using phosphoramidite chemistry
and commercially available reagents (S.L. Beaucage and M.H.
Caruthers (1981) Tetrahedron Letters 22, 1859-1869). The
oligonucleotides were purified by polyacrylamide gel
electrophoresis under denaturing conditions. Prior to annealing
complementary pairs of such DNA single strands, these were
kinased by T4 polynucleotide kinase and ATP.

All other methods and materials used are common state of the
art knowledge (J. Sambrook et al., Molecular Cloning, A
Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y. 1989).

Example 1 

The modified mouse salivary amylase signal peptide (MSA3sp)
(described in WO 89/02463) of the expression cassette of
plasmid pLSC6315D3 (described in Example 3 of WO 92/11378)
which contains a DNA sequence coding for the insulin precursor
MI3 (B(1-29)-Ala-Ala-Lys-A(1-21)), was replaced with the YAP3
signal peptide in the following steps:

A construct for easy exchange of signal peptides was made.
Through site-directed mutagenesis an Asp718 site was introduced
just prior to the signal initiation codon in pLaC196δ (cf. WO
89/02463, fig. 5), by the double primer method applying a
mutagenic primer NOR494:

3'-ATTTGCTGCCATGGTACTTTCAGAAGG (SEQ ID No:14)

where bold letters indicate mutations and the underlined
sequence indicates the initiation codon.

The resulting plasmid was termed pLaC196δ-Asp718 (see Fig. 1).

The nucleotide sequence of the region covering the junction
between signal peptide and leader peptide of the expression
cassette in pLSC6315D3 was modified by replacing the
ApaI-HgiAI restriction fragment with a synthetic DNA stretch, NOR
2521/2522:

NOR2521: 5'-CAA CCA ATA GAC ACG CGT AAA GAA GGC CTA
         CAG CAT GAT TAC GAT ACA GAG ATC TTG GAG (SEQ
         ID No:15)

NOR2522: 5'-C CAA GAT CTC TGT ATC GTA ATC ATG CTG TAG
         GCC TTC TTT ACG CGT GTC TAT TGG TTG GGC C (SEQ
         ID No:16)

The resulting plasmid was termed pLSC6315D3R (see Fig. 1).

The SphI-Asp718 fragment of pLaC196δ-Asp718 was ligated with
the SphI-MluI cut pLSC6315D3R plasmid and a synthetic stretch of
DNA encoding the YAP3 signal peptide:

YAP-sp1: 5'-GT ACC AAA ATA ATG AAA CTG AAA ACT GTA AGA
         TCT GCG GTC CTT TCG TCA CTC TTT GCA TCT CAG
         GTC CTT GGC CAA CCA ATA GAC A (SEQ ID No:17)

YAP-sp2: 5'-CG CGT GTC TAT TGG TTG GCC AAG GAC CTG AGA TGC
         AAA GAG TGA CGA AAG GAC CGC AGA TCT TAC
         AGT TTT CAG TTT CTA TAT TTT G (SEQ ID No:18)

The resulting plasmid pLaC257 essentially consists of
pLSC6315D3, in which the MSA3 signal peptide has been replaced
by the YAP3 signal peptide (see Fig. 2).

Yeast transformation: S. cerevisiae strain MT663 (E2-7B XE11-36
a/α, Δtpi/Δtpi, pep 4-3/pep 4-3) (the yeast strain MT663 was
deposited in the Deutsche Sammlung von Mikroorganismen und
Zellkulturen in connection with filing WO 92/11378 and was
given the deposit number DSM 6278) was grown on YPGaL (1% Bacto
yeast extract, 2% Bacto peptone, 2% galactose, 1% lactate) to
an O.D. at 600 nm of 0.6.

100 ml of culture was harvested by centrifugation, washed with
10 ml of water, recentrifuged and resuspended in 10 ml of a
solution containing 1.2 M sorbitol, 25 mM Na2EDTA pH = 8.0 and
6.7 mg/ml dithiothreitol. The suspension was incubated at 30°C
for 15 minutes, centrifuged and the cells resuspended in 10 ml
of a solution containing 1.2 M sorbitol, 10 mM Na2EDTA, 0.1 M
sodium citrate, pH = 5.8, and 2 mg Novozym 234. The suspension
was incubated at 30°C for 30 minutes, the cells collected by
centrifugation, washed in 10 ml of 1.2 M sorbitol and 10 ml of
CAS (1.2 M sorbitol, 10 mM CaCl2, 10 mM Tris HCl (Tris =
Tris(hydroxymethyl)aminomethane) pH = 7.5) and resuspended in
2 ml of CAS. For transformation, 1 ml of CAS-suspended cells
was mixed with approx. 0.1 of plasmid pLaC257 and left at
room temperature for 15 minutes. 1 ml of (20% polyethylene
glycol 4000, 20 mM CaCl2, 10 mM CaCl2, 10 mM Tris HCl, pH = 7.5)
was added and the mixture left for a further 30 minutes at room
temperature. The mixture was centrifuged and the pellet
resuspended in 0.1 ml of SOS (1.2 M sorbitol, 33% v/v YPD, 6.7
mM CaCl2, 14 μg/ml leucine) and incubated at 30°C for 2 hours.
The suspension was then centrifuged and the pellet resuspended
in 0.5 ml of 1.2 M sorbitol. Then, 6 ml of top agar (the SC
medium of Sherman et al., Methods in Yeast Genetics, Cold
Spring Harbor Laboratory (1982)) containing 1.2 M sorbitol plus
2.5% agar) at 52°C was added and the suspension poured on top of
plates containing the same agar-solidified, sorbitol containing
medium.

Transformant colonies were picked after 3 days at 30°C,
reisolated and used to start liquid cultures. One transformant
was selected for further characterization.

Fermentation: Yeast strain MT663 transformed with plasmid
pLaC257 was grown on YPD medium (1% yeast extract, 2% peptone
(from Difco Laboratories), and 3% glucose). A 1 liter culture
of the strain was shaken at 30°C to an optical density at 650
nm of 24. After centrifugation the supernatant was isolated.

MT663 cells transformed with plasmid pLSC6315D3 and cultured as
described above were used for a comparison of yields of MI3
insulin precursor. Yields of MI3 were determined directly on
culture supernatants by the method of Snel, Damgaard and
Mollerup, Chromatographia 24, 1987, pp. 329-332. The results
are shown below.

plasmid                  MI3 yield

pLSC6315D3 (Msa3sp)      100%
pLaC257 (YAP3)           120%

Example 2 



Plasmid pLSC6315D3 was modified in two steps. First, the MSA3
signal peptide was replaced by the spx3 signal peptide by
exchanging the SphI-ApaI fragment with the analogous fragment
from pLaC212spx3 (cf. WO 89/02463). From the resulting plasmid
pLSC6315spx3, a 302 bp EcoRI-DdeI fragment was isolated and
fused to the 204 bp NcoI-XbaI fragment of pKFN1003 (WO
90/10075) containing the DNA sequence encoding aprotinin via a
synthetic linker DNA, NOR2101/2100 (see Fig. 3)

NOR2101: 5'-T AAC GTC GC (SEQ ID No:19)

NOR2100: 5'-CAT GGC GAC G (SEQ ID No:20)

The resulting plasmid, pLaC242-Apr (see Fig. 3), was cleaved
with ClaI, dephosphorylated and applied in cloning of random
5'-CG-overhang fragments of DNA isolated from S. cerevisiae
strain MT663, according to the description in WO 92/11378.
Transformation and fermentation of yeast strain MT663 was
carried out as described in Example 1.

From the resulting library, yeast transformants harbouring the
plasmid pAPR-Sc1 (prepared by the method described in WO
92/11378), containing a leader the sequence of which is given in
Fig. 4, were selected by screening. The spx3 signal peptide of
pAPR-Sc1 was replaced by the YAP3 signal peptide by fusing the
SphI-StyI fragment from pLaC257 with the 300 bp NheI-XbaI
fragment of pAPR-Sc1 via the synthetic linker DNA MH1338/1339
(see Fig. 5):

MH 1338: 5'-CTT GGC CAA CCA TCG AAA TTG AAA CCA G (SEQ ID
         No:21)

MH 1339: 5'-CT AGC TGG TTT CAA TTT CGA TGG TTG GC (SEQ ID
         No:22)

The resulting plasmid was termed pLaC263 (see Fig. 5). The DNA
sequence and derived amino acid sequence of the EcoRI-XbaI
fragment of pLaC263 appears from Fig. 6.

plasmid                  aprotinin yield

pAPR-Sc1 (Spx3sp)        100%
pLaC263                  136%

Example 3 

A synthetic gene coding for human TFPI, the DNA sequence of
which was derived from the published sequence of a cDNA coding
for human tissue factor pathway inhibitor (TFPI) (Wun et al.,
J. Biol. Chem. 263 (1988) 6001-6004), was prepared by step-wise
cloning of synthetic restriction fragments into plasmid pBS(+).
The resulting gene was contained on a 928 base pair (bp) SalI
restriction fragment. The gene had 26 silent nucleotide
substitutions in degenerate codons as compared to the cDNA
resulting in fourteen unique restriction endonuclease sites.
The DNA sequence of the 928 bp SalI fragment and the
corresponding amino acid sequence of human TFPI (pre-form) are
shown in Fig. 7 (SEQ ID No:8).

This DNA sequence was subsequently truncated to code for a TFPI
variant composed of the first 161 amino acids. A
non-glycosylated variant, TFPI1-161-117Gln, in which the AAT codon for
Asn117 was replaced by CAA coding for Gln, was constructed by
site-directed mutagenesis in a manner known per se using
synthetic oligonucleotides. The DNA sequence encoding
TFPI1-161-117Gln was preceded by the synthetic signal-leader sequence
212spx3 (cf. WO 89/02463), see Fig. 8A. This construction was
inserted into the plasmid pP-212TFPI161-117Q (based on a vector
of the POT-type (G. Kawasaki and L. Bell, US patent 4,931,373),
cf. Fig. 8).

A 1.1 kb SphI-XbaI fragment containing the coding region for
212spx3-TFPI1-161-117Gln was isolated and cloned into the plasmid
pYES21 derived from the commercially available (Stratagene)
vector pYES 2.0 (cf. Fig. 8). This plasmid contains 2μ sequence
for replication in yeast, the yeast URA3 gene for plasmid
selection in ura3 strains, the β-lactamase gene for selection
in E. coli, the ColE1 origin of replication for replication in
E. coli, the f1 origin for recovery of single-stranded DNA
plasmid from superinfected E. coli strains, and the yeast CYC1
transcriptional terminator. The SphI-XbaI fragment was cloned
into pYES 2.0 in front of the CYC1 terminator. The resulting
plasmid pYES-212TFPI161-117Q (cf. Fig. 9) was cleaved with
PflMI and EcoRI to remove the coding region for the mouse
salivary amylase signal peptide which was replaced by a
double-stranded synthetic oligonucleotide sequence coding for the YAP3
signal peptide:



MHJ 1131: 5'-AAT TCA AAC TAA AAA ATG AAG CTT AAA ACT GTA AGA
          TCT GCG GTC CTT TCG TCA CTC TTT GCA TCG CAG GTC CTA GGT
          CAA CCA GTC A (SEQ ID No:23)

MHJ 1132: 5'-CTG GTT GAC CTA GGA CCT GCG ATG CAA AGA GTG ACG
          AAA GGA CCG CAG ATC TTA CAG TTT TAA GCT TCA TTT TTT AGT
          TTG (SEQ ID No:24)

resulting in plasmid pYES-ykTFPI161-117Q (cf. Fig. 8B and Fig.
9).
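
A quick consistency check on such a pair of synthetic strands is to confirm that one is the reverse complement of the other apart from the single-stranded ends. The sketch below does this for MHJ 1131 and MHJ 1132 as printed above; the helper names are hypothetical. The 5' extension it reports, AATT, is what one would expect for ligation to an EcoRI-cut end; the short 3' extension presumably matches the PflMI-cut end, although the text does not spell this out.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Both strands written 5'->3', as in the text; spaces are removed.
MHJ1131 = (
    "AAT TCA AAC TAA AAA ATG AAG CTT AAA ACT GTA AGA "
    "TCT GCG GTC CTT TCG TCA CTC TTT GCA TCG CAG GTC CTA GGT CAA CCA "
    "GTC A"
).replace(" ", "")
MHJ1132 = (
    "CTG GTT GAC CTA GGA CCT GCG ATG CAA AGA GTG ACG "
    "AAA GGA CCG CAG ATC TTA CAG TTT TAA GCT TCA TTT TTT AGT TTG"
).replace(" ", "")


def top_strand_overhangs(top: str, bottom: str) -> tuple[str, str]:
    """Return the (5', 3') single-stranded ends of `top` once `bottom` is annealed.

    Assumes the whole of `bottom` pairs with a contiguous stretch of `top`.
    """
    core = revcomp(bottom)
    start = top.find(core)
    if start < 0:
        raise ValueError("strands are not fully complementary")
    return top[:start], top[start + len(core):]


five_prime, three_prime = top_strand_overhangs(MHJ1131, MHJ1132)
print(five_prime, three_prime)
```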

Plasmids pYES-212TFPI161-117Q and pYES-ykTFPI161-117Q were
transformed into the haploid yeast strain YNG318 (MATa ura3-52
leu2-Δ2 pep4-Δ1 his4-539 [cir+]). Plasmid selection was for Ura+
cells. Reisolated transformants were grown in 50 ml of
synthetic complete medium lacking uracil (SC-ura) for 3 days at
30°C. After measuring cell density (OD600), the cultures were
centrifuged and the resulting supernatants were analysed for
the level of secreted FXa/TF/FVIIa-dependent chromogenic
TFPI-activity (P.M. Sandset et al., Thromb. Res. 47, 1987,
pp. 389-400). The mean activity measured for supernatants from
strains containing plasmid pYES-212TFPI161-117Q (i.e. the plasmid
containing the mouse salivary amylase signal sequence) was 0.65
U/ml·OD. The mean activity measured for supernatants from
strains containing plasmid pYES-ykTFPI161-117Q was 1.00
U/ml·OD.
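
The two mean activities reported above can be compared directly. The short sketch below computes the ratio between them; the function name `fold_change` is illustrative, and the calculation uses only the two reported means (no replicate data or variances are given in the text, so no statistical statement is implied).

```python
def fold_change(test_activity: float, reference_activity: float) -> float:
    """Ratio of two activities expressed in the same unit (here U/ml per OD)."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return test_activity / reference_activity


# Mean activities as reported in the text, in U/ml per OD unit.
amylase_signal_activity = 0.65  # pYES-212TFPI161-117Q
yap3_signal_activity = 1.00     # pYES-ykTFPI161-117Q

ratio = fold_change(yap3_signal_activity, amylase_signal_activity)
percent_increase = (ratio - 1.0) * 100.0

print(f"{ratio:.2f}-fold, {percent_increase:.0f}% higher")
```

On these figures the construct carrying the YAP3 signal peptide gave roughly one and a half times the secreted activity of the construct carrying the mouse salivary amylase signal peptide.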
SEQUENCE LISTING

(1) GENERAL INFORMATION:

  (i) APPLICANT:
    (A) NAME: Novo Nordisk A/S
    (B) STREET: Novo Alle
    (C) CITY: Bagsvaerd
    (E) COUNTRY: Denmark
    (F) POSTAL CODE (ZIP): 2880
    (G) TELEPHONE: +45 4444 8888
    (H) TELEFAX: +45 4449 3256

  (ii) TITLE OF INVENTION: A DNA Construct Encoding the YAP3 Signal
       Peptide

  (iii) NUMBER OF SEQUENCES: 24

  (iv) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 63 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: cDNA
  (iii) HYPOTHETICAL: NO
  (iii) ANTI-SENSE: NO
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: Saccharomyces cerevisiae
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGAAACTGA AAACTGTAAG ATCTGCGGTC CTTTCGTCAC TCTTTGCATC TCAGGTCCTT  60
GGC  63

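
The 63-base sequence of SEQ ID NO: 1 can be translated with the standard genetic code to give the 21-residue YAP3 signal peptide. The sketch below is illustrative only: the codon table is built from the usual TCAG ordering of the standard code, and the names `CODON_TABLE` and `translate` are not taken from the document.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# SEQ ID NO: 1 as given in the sequence listing and in claim 3, spaces removed.
SEQ_ID_NO_1 = (
    "ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA "
    "TCT CAG GTC CTT GGC"
).replace(" ", "")

signal_peptide = translate(SEQ_ID_NO_1)
print(len(SEQ_ID_NO_1), signal_peptide)
```

The result, Met-Lys-Leu-Lys-Thr-Val-Arg-Ser-Ala-Val-Leu-Ser-Ser-Leu-Phe-Ala-Ser-Gln-Val-Leu-Gly, matches the first 21 residues shown for the constructs of SEQ ID NO: 3 and SEQ ID NO: 7.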
(2) INFORMATION FOR SEQ ID NO: 2:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 476 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: NO
  (iii) ANTI-SENSE: NO
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 81..452
  (ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 81..293
  (ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 294..452
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA  60

TAAACGACGG TACCAAAATA ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC  110
                      Met Lys Leu Lys Thr Val Arg Ser Ala Val
                      -71 -70                 -65

CTT TCG TCA CTC TTT GCA TCT CAG GTC CTT GGC CAA CCA ATA GAC ACG  158
Leu Ser Ser Leu Phe Ala Ser Gln Val Leu Gly Gln Pro Ile Asp Thr
    -60                 -55                 -50

CGT AAA GAA GGC CTA CAG CAT GAT TAC GAT ACA GAG ATC TTG GAG CAC  206
Arg Lys Glu Gly Leu Gln His Asp Tyr Asp Thr Glu Ile Leu Glu His
-45                 -40                 -35                 -30

ATT GGA AGC GAT GAG TTA ATT TTG AAT GAA GAG TAT GTT ATT GAA AGA  254
Ile Gly Ser Asp Glu Leu Ile Leu Asn Glu Glu Tyr Val Ile Glu Arg
                -25                 -20                 -15

ACT TTG CAA GCC ATC GAT AAC ACC ACT TTG GCT AAG AGA TTC GTT AAC  302
Thr Leu Gln Ala Ile Asp Asn Thr Thr Leu Ala Lys Arg Phe Val Asn
            -10                 -5                  1

CAA CAC TTG TGC GGT TCC CAC TTG GTT GAA GCT TTG TAC TTG GTT TGC  350
Gln His Leu Cys Gly Ser His Leu Val Glu Ala Leu Tyr Leu Val Cys
    5                   10                  15

GGT GAA AGA GGT TTC TTC TAC ACT CCT AAG GCT GCT AAG GGT ATT GTC  398
Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Ala Ala Lys Gly Ile Val
20                  25                  30                  35

GAA CAA TGC TGT ACC TCC ATC TGC TCC TTG TAC CAA TTG GAA AAC TAC  446
Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu Glu Asn Tyr
                40                  45                  50

TGC AAC TAGACGCAGC CCGCAGGCTC TAGA  476
Cys Asn



(2) INFORMATION FOR SEQ ID NO: 3:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 124 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Lys Leu Lys Thr Val Arg Ser Ala Val Leu Ser Ser Leu Phe Ala
-71 -70                 -65                 -60

Ser Gln Val Leu Gly Gln Pro Ile Asp Thr Arg Lys Glu Gly Leu Gln
-55                 -50                 -45                 -40

His Asp Tyr Asp Thr Glu Ile Leu Glu His Ile Gly Ser Asp Glu Leu
                -35                 -30                 -25

Ile Leu Asn Glu Glu Tyr Val Ile Glu Arg Thr Leu Gln Ala Ile Asp
            -20                 -15                 -10

Asn Thr Thr Leu Ala Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser
        -5                  1               5

His Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe
10                  15                  20                  25

Tyr Thr Pro Lys Ala Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser
                30                  35                  40

Ile Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
            45                  50

(2) INFORMATION FOR SEQ ID NO: 4:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 450 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: NO
  (iii) ANTI-SENSE: NO
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 76..441
  (ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 76..267
  (ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 268..441
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GAATrCATIC AAGAATAGnT CAAACAAGAA GATTACAAAC TA3X3^AT1TC AHAGACAAIA 60 

15 T3AA0GATIA AAAGA ATO AAG GCT GIT ITC 111 
Met lys Ala Val Fhe Leu Val Leu Ser Lai He Gly 
-64 -60 -55 

TTCTKTCGQCCCAAOCATaSAAATTCAAAOCAGCT ATA CAA 159 

Rie Cys Trp Ala Gin Pro Ser lys Leu lys Pro Ala Ser Asp lie Gin 
20 -50 -45 -40 

AST err TAG GAC CAT GGT GTG AQG GAG TPC GGG GAA AAC TAT GIT CAA 207 
lie Ifiu Tyr Asp His Gly Val Arg Glu Phe Gly Glu Asn Tyr Val Gin 
-35 -30 -25 



25 



GAG TTG ATC GAT AAC AOC ACT TIG GCT AAC GTC GCC ATS GCT GAG AGA 255 
Glu leu lie Asp Asn Urr Thr Leu Ala Asn Val Ala Met Ala Glu Arg 
-20 -15 -10 -5 

TIG GAG AAG AGA AQG OCT GAT TTC TCT TIG GAA OCT CX3V TAC ACT GCT 303 
lau Glu lys Arg Arg Pro Asp Rie Cys Lai Glu Pro Pro Tyr Dir Gly 
15 10 

30aC3VTCTAAAGCTAGAATCATCAGATACTrCTACAACGa:AAGGCT 351 
Pro Cys lys Ala Arg lie lie Arg Tyr Fhe lyr Asn Ala lys Ala Gly 
15 20 25 

TTGTCTCAAACrTrCGrrTACGCTGQCTGCAGAGCTAAGAGAAAC;^^ 399 
iHu Gin Ihr Ete Val Tyr Gly Gly cys Arg Ala Lys Arg Asn Asn 
35 30 35 40 

TrCAAGTCTGCTGAAGACTCSCATGACAACrTGrrGCTGCTGCC 441 
Rie lys Ser Ala Glu Asp cys Met Arg Ihr cys Gly Gly Ala 
45 50 55 



TAATCTAGA 



450 
(2) INFORMATION FOR SEQ ID NO: 5:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 122 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Lys Ala Val Phe Leu Val Leu Ser Leu Ile Gly Phe Cys Trp Ala
-64             -60                 -55                 -50

Gln Pro Ser Lys Leu Lys Pro Ala Ser Asp Ile Gln Ile Leu Tyr Asp
            -45                 -40                 -35

His Gly Val Arg Glu Phe Gly Glu Asn Tyr Val Gln Glu Leu Ile Asp
        -30                 -25                 -20

Asn Thr Thr Leu Ala Asn Val Ala Met Ala Glu Arg Leu Glu Lys Arg
    -15                 -10                 -5

Arg Pro Asp Phe Cys Leu Glu Pro Pro Tyr Thr Gly Pro Cys Lys Ala
1               5                   10                  15

Arg Ile Ile Arg Tyr Phe Tyr Asn Ala Lys Ala Gly Leu Cys Gln Thr
            20                  25                  30

Phe Val Tyr Gly Gly Cys Arg Ala Lys Arg Asn Asn Phe Lys Ser Ala
        35                  40                  45

Glu Asp Cys Met Arg Thr Cys Gly Gly Ala
    50                  55

(2) INFORMATION FOR SEQ ID NO: 6:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 470 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: NO
  (iii) ANTI-SENSE: NO
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 81..461
  (ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 81..287
  (ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 288..461
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA  60

TAAACGACGG TACCAAAATA ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC  110
                      Met Lys Leu Lys Thr Val Arg Ser Ala Val
                      -69             -65                 -60

CTT TCG TCA CTC TTT GCA TCT CAG GTC CTT GGC CAA CCA TCG AAA TTG  158
Leu Ser Ser Leu Phe Ala Ser Gln Val Leu Gly Gln Pro Ser Lys Leu
                -55                 -50                 -45

AAA CCA GCT AGC GAT ATA CAA ATT CTT TAC GAC CAT GGT GTG AGG GAG  206
Lys Pro Ala Ser Asp Ile Gln Ile Leu Tyr Asp His Gly Val Arg Glu
            -40                 -35                 -30

TTC GGG GAA AAC TAT GTT CAA GAG TTG ATC GAT AAC ACC ACT TTG GCT  254
Phe Gly Glu Asn Tyr Val Gln Glu Leu Ile Asp Asn Thr Thr Leu Ala
        -25                 -20                 -15

AAC GTC GCC ATG GCT GAG AGA TTG GAG AAG AGA AGG CCT GAT TTC TGT  302
Asn Val Ala Met Ala Glu Arg Leu Glu Lys Arg Arg Pro Asp Phe Cys
    -10                 -5                  1               5

TTG GAA CCT CCA TAC ACT GGT CCA TGT AAA GCT AGA ATC ATC AGA TAC  350
Leu Glu Pro Pro Tyr Thr Gly Pro Cys Lys Ala Arg Ile Ile Arg Tyr
                10                  15                  20

TTC TAC AAC GCC AAG GCT GGT TTG TGT CAA ACT TTC GTT TAC GGT GGC  398
Phe Tyr Asn Ala Lys Ala Gly Leu Cys Gln Thr Phe Val Tyr Gly Gly
            25                  30                  35

TGC AGA GCT AAG AGA AAC AAC TTC AAG TCT GCT GAA GAC TGC ATG AGA  446
Cys Arg Ala Lys Arg Asn Asn Phe Lys Ser Ala Glu Asp Cys Met Arg
        40                  45                  50

ACT TGT GGT GGT GCC TAATCTAGA  470
Thr Cys Gly Gly Ala
    55



(2) INFORMATION FOR SEQ ID NO: 7:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 127 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Lys Leu Lys Thr Val Arg Ser Ala Val Leu Ser Ser Leu Phe Ala
-69             -65                 -60                 -55

Ser Gln Val Leu Gly Gln Pro Ser Lys Leu Lys Pro Ala Ser Asp Ile
            -50                 -45                 -40

Gln Ile Leu Tyr Asp His Gly Val Arg Glu Phe Gly Glu Asn Tyr Val
        -35                 -30                 -25

Gln Glu Leu Ile Asp Asn Thr Thr Leu Ala Asn Val Ala Met Ala Glu
    -20                 -15                 -10

Arg Leu Glu Lys Arg Arg Pro Asp Phe Cys Leu Glu Pro Pro Tyr Thr
-5                  1               5                   10

Gly Pro Cys Lys Ala Arg Ile Ile Arg Tyr Phe Tyr Asn Ala Lys Ala
            15                  20                  25

Gly Leu Cys Gln Thr Phe Val Tyr Gly Gly Cys Arg Ala Lys Arg Asn
        30                  35                  40

Asn Phe Lys Ser Ala Glu Asp Cys Met Arg Thr Cys Gly Gly Ala
    45                  50                  55

(2) INFORMATION FOR SEQ ID NO: 8:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 928 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: cDNA
  (iii) HYPOTHETICAL: NO
  (iii) ANTI-SENSE: NO
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: Homo sapiens
  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 8..919
  (ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 8..91
  (ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 92..919
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GTOGAOC AIG ATT TAC ACA A3G AAG AAA Glk CAT GCA CTT TGG OCT AGC 49 
Met lie lyr Hir Met lys Lys Val His Ala Laa Trp Ala Ser 
-28 -25 -20 -15 

5GIATCX:CTCCTGCITAATClTG0CaCTGa:0CTOT 97 
ValCysIfiuI^IfiuAsnlfiuAlaitoAlaPrDlfiuAsnAlaAspSe^ 
-10 -5 1 



GftGGAAGKTaAGAAO^^^ 145 
* - - — ^jj^ ^jj^ 

10 15 



du Glu Asp Glu Glu His Uir lie lie Ihr Asp Bit Glu Lai Pro Pro 
10 5 10 m 



CTS AAA err ATC CAT TCA TIT TCT GCA ITC AAG GOG GAT GAT GGG CXX: 193 
IfiuIysIfiuMetHisSerFheCysAlaRielysAlaAspAspGlyPro 
20 25 30 

TOT AAA GCAATCATCAAAAGATITTrcTrcAATATrTICACTCX^ 241 
15c^IysManeMetIysArgIteEteRieAsnneI*ielhrArgGln 
35 40 45 50 

TGC GAA GAA TIT Am TAT GGG GGA ICT GAA GGA AAT CAG AAT CGA TTT 289 
Cys Glu Glu Rie ne Tyr Gly Gly Cys Glu Gly Asn Gin Asn Arg Rie 
55 60 65 

20 GAA AGT CTCGAAGAGTGCAAAAAAATGTSrACAAGAGATAATGCA AAC 337 
Glu Ser Ifiu Glu Glu cys Lys lys Met Cys Hir Arg Asp Asn Ala Asn 
70 75 80 

AQG ATT ATA AAG ACA ACA CTO CAG CAA GAA AAG OCA GAT TIC TSC TTT 385 
Arg lie lie Lys Bur Uir Leu Gin Gin Glu lys Pro Asp Fhe cys ite 
25 85 90 95 

TDS GAA Cac GAT OCT GGA ATA TST OGA GCT TAT ATT AOC AGG TAT TTT 433 
l£ai Glu Glu Asp Pro Gly lie cys Arg Gly Tyr He Bir Arg lyr Ete 

100 105 no 

TATAACAATCAGAGAAAACAGTGrGAAAQGTICAAGTATGG^ 481 
30 Tyr Asn Asn Gin Qir lys Gin cys Glu Arg Rie Lys lyr Gly Gly cys 
US 120 125 130 

CIGQGCAATATGAACAATTITGAGACACrcGAGGAATGCAAGAAC^ 529 
Gly Asn Met Asn Asn Rie Glu Bir Leu Glu Glu cys lys Asn lie 
135 140 145 

35 TCT GAA GAT GCT COG AAT GCT TIC CAG CTG CTT AAT TAT GCT AOC CAG 577 
Cys Glu Asp Gly Pro Asn Gly Efie Gin Val Asp Asn Tyr Gly Uir Gin 
150 155 160 

CIC AAT GCT an AAC AAC TOC CIG ACT COG CAA TCA AOC AAG CTT OOC 625 
Ifiu Asn Ala Val Asn Asn Ser Leu Hir Pro Gin Ser Hir Lys Val Pro 
40 165 170 175 
AGC CTT TTT GAA TTC CAC GGT CCC TCA TGG TGT CTC ACT CCA GCA GAT  673
Ser Leu Phe Glu Phe His Gly Pro Ser Trp Cys Leu Thr Pro Ala Asp
    180                 185                 190

AGA GGA TTG TGT CGT GCC AAT GAG AAC AGA TTC TAC TAC AAT TCA GTC  721
Arg Gly Leu Cys Arg Ala Asn Glu Asn Arg Phe Tyr Tyr Asn Ser Val
195                 200                 205                 210

ATT GGG AAA TGC CGC CCA TTT AAG TAC TCC GGA TGT GGG GGA AAT GAA  769
Ile Gly Lys Cys Arg Pro Phe Lys Tyr Ser Gly Cys Gly Gly Asn Glu
                215                 220                 225

AAC AAT TTT ACT AGT AAA CAA GAA TGT CTG AGG GCA TGC AAA AAA GGT  817
Asn Asn Phe Thr Ser Lys Gln Glu Cys Leu Arg Ala Cys Lys Lys Gly
            230                 235                 240

TTC ATC CAA AGA ATA TCA AAA GGA GGC CTA ATT AAA ACC AAA AGA AAA  865
Phe Ile Gln Arg Ile Ser Lys Gly Gly Leu Ile Lys Thr Lys Arg Lys
        245                 250                 255

AGA AAG AAG CAG AGA GTG AAA ATA GCA TAT GAA GAA ATT TTT GTT AAA  913
Arg Lys Lys Gln Arg Val Lys Ile Ala Tyr Glu Glu Ile Phe Val Lys
    260                 265                 270

AAT ATG TGAGTCGAC  928
Asn Met
275



(2) INFORMATION FOR SEQ ID NO: 9:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 304 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Ile Tyr Thr Met Lys Lys Val His Ala Leu Trp Ala Ser Val Cys
-28         -25                 -20                 -15

Leu Leu Leu Asn Leu Ala Pro Ala Pro Leu Asn Ala Asp Ser Glu Glu
        -10                 -5                  1

Asp Glu Glu His Thr Ile Ile Thr Asp Thr Glu Leu Pro Pro Leu Lys
5                   10                  15                  20

Leu Met His Ser Phe Cys Ala Phe Lys Ala Asp Asp Gly Pro Cys Lys
                25                  30                  35

Ala Ile Met Lys Arg Phe Phe Phe Asn Ile Phe Thr Arg Gln Cys Glu
            40                  45                  50


Glu Phe Ile Tyr Gly Gly Cys Glu Gly Asn Gln Asn Arg Phe Glu Ser
        55                  60                  65

Leu Glu Glu Cys Lys Lys Met Cys Thr Arg Asp Asn Ala Asn Arg Ile
    70                  75                  80

Ile Lys Thr Thr Leu Gln Gln Glu Lys Pro Asp Phe Cys Phe Leu Glu
85                  90                  95                  100

Glu Asp Pro Gly Ile Cys Arg Gly Tyr Ile Thr Arg Tyr Phe Tyr Asn
                105                 110                 115

Asn Gln Thr Lys Gln Cys Glu Arg Phe Lys Tyr Gly Gly Cys Leu Gly
            120                 125                 130

Asn Met Asn Asn Phe Glu Thr Leu Glu Glu Cys Lys Asn Ile Cys Glu
        135                 140                 145

Asp Gly Pro Asn Gly Phe Gln Val Asp Asn Tyr Gly Thr Gln Leu Asn
    150                 155                 160

Ala Val Asn Asn Ser Leu Thr Pro Gln Ser Thr Lys Val Pro Ser Leu
165                 170                 175                 180

Phe Glu Phe His Gly Pro Ser Trp Cys Leu Thr Pro Ala Asp Arg Gly
                185                 190                 195

Leu Cys Arg Ala Asn Glu Asn Arg Phe Tyr Tyr Asn Ser Val Ile Gly
            200                 205                 210

Lys Cys Arg Pro Phe Lys Tyr Ser Gly Cys Gly Gly Asn Glu Asn Asn
        215                 220                 225

Phe Thr Ser Lys Gln Glu Cys Leu Arg Ala Cys Lys Lys Gly Phe Ile
    230                 235                 240

Gln Arg Ile Ser Lys Gly Gly Leu Ile Lys Thr Lys Arg Lys Arg Lys
245                 250                 255                 260

Lys Gln Arg Val Lys Ile Ala Tyr Glu Glu Ile Phe Val Lys Asn Met
                265                 270                 275



(2) INFORMATION FOR SEQ ID NO: 10:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 234 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: NO
  (iii) ANTI-SENSE: NO
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 76..234
  (ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 76..222
  (ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 223..234
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GAMTCanC AAGAMftGIT CAAACAAGAA GACTACS^C TATCAATTIC ATACACAAIA 60 

TAAAOGACTA AAAGA iOTSAftCGCTGITTICTIGCOTriGTOCTIGATCG^ 111 
15 Meft lys Ala Val Hie Leu Val Leu Ser Leu lie Gly 

-49 -45 -40 

TICTCCTQGQOCCAACtAGICACTQGCGATGAAT^ 159 
Hie Cys Trp Ala Gin Pro Val Bir Gly Asp Glu Ser Ser Val Glu He 
-35 -30 -25 

20oa;GAAGAGTCrCTCATCAICGCTGAAAACAOCACTTro 207 
Pro Glu Glu Ser Leu lie lie Ala Glu Asn Dir Uir Lai Ala Asn Val 
-20 -15 -10 

GOC AIG GCT AAG AQV GAT TCT GM GAA 234 
Ala Met Ala lys Arg Asp Ser Glu Glu 
25-5 1 

(2) INFORMATION FOR SEQ ID NO: 11:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 53 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Lys Ala Val Phe Leu Val Leu Ser Leu Ile Gly Phe Cys Trp Ala
-49             -45                 -40                 -35

Gln Pro Val Thr Gly Asp Glu Ser Ser Val Glu Ile Pro Glu Glu Ser
            -30                 -25                 -20

Leu Ile Ile Ala Glu Asn Thr Thr Leu Ala Asn Val Ala Met Ala Lys
        -15                 -10                 -5

Arg Asp Ser Glu Glu
    1

(2) INFORMATION FOR SEQ ID NO: 12:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 190 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (iii) HYPOTHETICAL: NO
  (iii) ANTI-SENSE: NO
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 17..190
  (ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 17..178
  (ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 179..190
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GAAaTCAAAC TAAAAA ATC AAG CTT AAA ACT CTA AGA TCT GOG GTC CTT 49 
25 Met lys Leu lys Thr Val Arg Ser Ala Val Lai 

-54 -50 -45 

TOG TCA dC TIT GCA TOG CAG GTC CTA GCT CAA OCA CTC ACT GGC GAT 97 
Ser Ser Lbu Rie Ala Ser Gin Val Laa Gly Gin Pro Val Hit Gly Asp 
-40 -35 -30 

30GAATCATCTGTrGAGAITCXX3GAAGAGTCTCTGATCATCGCTGAAAAC 145 
Glu Ser Ser Val Glu lie Pro Glu Glu Ser Leu lie lie Ala Glu Asn 
-25 -20 -15 

ACrACTTrGGCTAACGTCGOCAroGCTAAGAGAGATTCTGAGGAA 190 
Ihr Hir Leu Ala Asn Val Ala Met Ala lys Arg Asp Ser Glu Glu 
35 -10 -5 I 

(2) INFORMATION FOR SEQ ID NO: 13:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 58 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Lys Leu Lys Thr Val Arg Ser Ala Val Leu Ser Ser Leu Phe Ala
-54             -50                 -45                 -40

Ser Gln Val Leu Gly Gln Pro Val Thr Gly Asp Glu Ser Ser Val Glu
            -35                 -30                 -25

Ile Pro Glu Glu Ser Leu Ile Ile Ala Glu Asn Thr Thr Leu Ala Asn
        -20                 -15                 -10

Val Ala Met Ala Lys Arg Asp Ser Glu Glu
    -5                  1

(2) INFORMATION FOR SEQ ID NO: 14:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 27 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
ATITGCIGOC ATOGEAdTT CAGAAQG 
(2) INFORMATION FOR SEQ ID NO: 15:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 60 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CAACCAATAG ACACGCGTAA AGAAGGCCTA CAGCATGATT ACGATACAGA GATCTTGGAG  60
(2) INFORMATION FOR SEQ ID NO: 16:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 62 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

OCSg^GftTCIC TCEftlOGTAA TCAIGCTSEA QGOCTICnT AOGOSKnU' AITCGITOGG 60 

oc 

62 

(2) INFORMATION FOR SEQ ID NO: 17:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 87 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GTACCAAAAT AATGAAACTG AAAACTGTAA GATCTGCGGT CCTTTCGTCA CTCTTTGCAT  60
CTCAGGTCCT TGGCCAACCA ATAGACA  87
(2) INFORMATION FOR SEQ ID NO: 18:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 87 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CGCGTGTCTA TTGGTTGGCC AAGGACCTGA GATGCAAAGA GTGACGAAAG GACCGCAGAT  60
CTTACAGTTT TCAGTTTCAT TATTTTG  87

(2) INFORMATION FOR SEQ ID NO: 19:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 9 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TAAOJrOGC 

(2) INFORMATION FOR SEQ ID NO: 20:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
CAIGGCGAOG 
(2) INFORMATION FOR SEQ ID NO: 21:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 28 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CTTGGCCAAC CATCGAAATT GAAACCAG  28

(2) INFORMATION FOR SEQ ID NO: 22:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 28 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CTAGCTGGTT TCAATTTCGA TGGTTGGC  28

(2) INFORMATION FOR SEQ ID NO: 23:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 88 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

AATTCAAACT AAAAAATGAA GCTTAAAACT GTAAGATCTG CGGTCCTTTC GTCACTCTTT  60
GCATCGCAGG TCCTAGGTCA ACCAGTCA  88
(2) INFORMATION FOR SEQ ID NO: 24:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 81 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

CTGGTTGACC TAGGACCTGC GATGCAAAGA GTGACGAAAG GACCGCAGAT CTTACAGTTT  60
TAAGCTTCAT TTTTTAGTTT G  81
CLAIMS

1. A DNA construct comprising the following sequence

5'-P-SP-(LP)n-PS-HP-3'

wherein

P is a promoter sequence,
SP is a DNA sequence encoding the yeast aspartic protease 3
(YAP3) signal peptide,
LP is a DNA sequence encoding a leader peptide,
n is 0 or 1,
PS is a DNA sequence encoding a peptide defining a yeast
processing site, and
HP is a DNA sequence encoding a polypeptide which is
heterologous to a selected host organism.

2. A DNA construct according to claim 1, wherein the promoter
sequence is selected from the Saccharomyces cerevisiae MFα1,
TPI, ADH, BAR1 or PGK promoter, or the Schizosaccharomyces
pombe ADH promoter.

3. A DNA construct according to claim 1, wherein the YAP3
signal peptide is encoded by the following DNA sequence

ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA
TCT CAG GTC CTT GGC (SEQ ID No: 1)

or a suitable modification thereof encoding a peptide with a
high degree of homology to the YAP3 signal peptide.

4. A DNA construct according to claim 1, wherein n is 1. 

5. A DNA construct according to claim 4, wherein the leader
peptide is a yeast MFα1 leader peptide or a synthetic leader
peptide.

6. A DNA construct according to claim 1, wherein PS is a DNA
sequence encoding Lys-Arg, Arg-Lys, Lys-Lys, Arg-Arg or
Ile-Glu-Gly-Arg.

7. A DNA construct according to claim 1, wherein the
heterologous polypeptide is selected from the group consisting
of aprotinin, tissue factor pathway inhibitor or other protease
inhibitors, insulin or insulin precursors, human or bovine
growth hormone, interleukin, glucagon, glucagon-like peptide 1,
tissue plasminogen activator, transforming growth factor α or
β, platelet-derived growth factor, enzymes, or a functional
analogue thereof.

8. A DNA construct according to claim 1, which further 
comprises a transcription termination sequence. 

9. A DNA construct according to claim 8, wherein the
transcription termination sequence is the TPI terminator.

10. A recombinant expression vector comprising a DNA construct
according to any of claims 1-9.

11. A cell transformed with a vector according to claim 10.

12. A cell according to claim 11, which is a fungal cell.

13. A cell according to claim 12, which is a yeast cell.

14. A cell according to claim 13, which is a cell of
Saccharomyces, Schizosaccharomyces, Kluyveromyces, Hansenula or
Yarrowia.

15. A cell according to claim 14, which is a cell of
Saccharomyces cerevisiae or Schizosaccharomyces pombe.

16. A method of producing a heterologous polypeptide, the
method comprising culturing a cell which is capable of
expressing a heterologous polypeptide and which is transformed
with a DNA construct according to any of claims 1-9 in a
suitable medium to obtain expression and secretion of the
heterologous polypeptide, after which the heterologous
polypeptide is recovered from the medium.
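
The element order of the construct in claim 1, 5'-P-SP-(LP)n-PS-HP-3' with n equal to 0 or 1, can be sketched as a small data structure. This is an illustration only: the class `Construct`, the 20-base promoter placeholder, the two-codon leader placeholder and the four-codon heterologous placeholder are hypothetical. SP is the sequence of SEQ ID No: 1, and PS uses AAG AGA, which encodes Lys-Arg in the standard genetic code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    """Elements of 5'-P-SP-(LP)n-PS-HP-3'; an empty leader means n = 0."""
    promoter: str          # P
    signal_peptide: str    # SP
    processing_site: str   # PS
    heterologous: str      # HP
    leader: str = ""       # LP, present only when n = 1

    @property
    def n(self) -> int:
        return 1 if self.leader else 0

    def coding_region(self) -> str:
        """Concatenate SP, optional LP, PS and HP and check the reading frame."""
        region = (self.signal_peptide + self.leader
                  + self.processing_site + self.heterologous)
        if len(region) % 3 != 0:
            raise ValueError("coding elements are not in a common reading frame")
        return region

    def assemble(self) -> str:
        return self.promoter + self.coding_region()


YAP3_SP = (
    "ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA "
    "TCT CAG GTC CTT GGC"
).replace(" ", "")              # SEQ ID No: 1
LYS_ARG = "AAGAGA"              # Lys-Arg processing site
PLACEHOLDER_PROMOTER = "N" * 20  # hypothetical stand-in, not a real promoter
PLACEHOLDER_LEADER = "CAACCA"    # hypothetical two-codon stand-in for LP
PLACEHOLDER_HP = "GATTCTGAGGAA"  # hypothetical four-codon stand-in for HP

without_leader = Construct(PLACEHOLDER_PROMOTER, YAP3_SP, LYS_ARG, PLACEHOLDER_HP)
with_leader = Construct(PLACEHOLDER_PROMOTER, YAP3_SP, LYS_ARG, PLACEHOLDER_HP,
                        leader=PLACEHOLDER_LEADER)

print(without_leader.n, len(without_leader.coding_region()))
print(with_leader.n, len(with_leader.coding_region()))
```

Keeping the leader optional mirrors the two cases of claim 1 (n = 0) and claim 4 (n = 1) in a single structure.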
[Fig. 1a: construction scheme (plasmid maps not reproduced).
Recoverable labels: SphI (1), NcoI (561), XbaI (752); mp19RF:
SphI-XbaI; 752 bp SphI-XbaI; 10 kb XbaI-SphI + 191 bp NcoI-XbaI;
SphI (1), Asp718 (442), NcoI (560), XbaI (751); primer annealed,
extended and ligated: Klenow polymerase + dNTP + T4 DNA ligase +
ATP; 561 bp SphI-NcoI.]

[Fig. 1b: plasmid map (not reproduced). Recoverable labels:
SphI (1), ApaI (489).]

        10        20        30        40        50        60
         |         |         |         |         |         |
GGAATTCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
         |         |         |         |         |         |
ATAAACGACGGTACCAAAATAATGAAACTGAAAACTGTAAGATCTGCGGTCCTTTCGTCA
                     METLysLeuLysThrValArgSerAlaValLeuSerSer
                     YAP3sp

       130       140       150       160       170       180
         |         |         |         |         |         |
CTCTTTGCATCTCAGGTCCTTGGCCAACCAATAGACACGCGTAAAGAAGGCCTACAGCAT
LeuPheAlaSerGlnValLeuGlyGlnProIleAspThrArgLysGluGlyLeuGlnHis
                        ************************************

       190       200       210       220       230       240
         |         |         |         |         |         |
GATTACGATACAGAGATCTTGGAGCACATTGGAAGCGATGAGTTAATTTTGAATGAAGAG
AspTyrAspThrGluIleLeuGluHisIleGlySerAspGluLeuIleLeuAsnGluGlu
****************** 63.15d3 leader **************************

       250       260       270       280       290       300
         |         |         |         |         |         |
TATGTTATTGAAAGAACTTTGCAAGCCATCGATAACACCACTTTGGCTAAGAGATTCGTT
TyrValIleGluArgThrLeuGlnAlaIleAspAsnThrThrLeuAlaLysArgPheVal
*******

       310       320       330       340       350       360
         |         |         |         |         |         |
AACCAACACTTGTGCGGTTCCCACTTGGTTGAAGCTTTGTACTTGGTTTGCGGTGAAAGA
AsnGlnHisLeuCysGlySerHisLeuValGluAlaLeuTyrLeuValCysGlyGluArg
»»»»»»»»»»» Insulin precursor MI3 »»»»»»»»»»

       370       380       390       400       410       420
         |         |         |         |         |         |
GGTTTCTTCTACACTCCTAAGGCTGCTAAGGGTATTGTCGAACAATGCTGTACCTCCATC
GlyPhePheTyrThrProLysAlaAlaLysGlyIleValGluGlnCysCysThrSerIle
»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»

       430       440       450       460       470
         |         |         |         |         |
TGCTCCTTGTACCAATTGGAAAACTACTGCAACTAGACGCAGCCCGCAGGCTCTAGA
CysSerLeuTyrGlnLeuGluAsnTyrCysAsn
»»»»»»»»»»»»»»»»»»

Fig. 2
[Fig. 3a: construction scheme (plasmid maps not reproduced).
Recoverable labels: SphI (1), ApaI (489), XbaI (860); 419 bp
SphI-ApaI; 367 bp ApaI-XbaI + 10 kb XbaI-SphI; SphI (1),
ApaI (489), DdeI (675), XbaI (770); pKFN1003: 204 bp NcoI-XbaI;
NOR2101/2100 DdeI-NcoI; 302 bp EcoRI-DdeI; 373 bp SphI-EcoRI +
10 kb XbaI-SphI; ClaI (502), ClaI (659), NcoI (687), XbaI (890);
516 bp EcoRI-XbaI.]

[Fig. 3b: library construction and screening scheme (not
reproduced). Recoverable steps: SphI (1), ClaI (502); 10 kb
ClaI-ClaI fragment, dephosphorylated, + S.c. MT663 DNA digest with
5' CG tails; library of appr. 10^ (exponent illegible) pAPRSc's;
appr. 10^ (exponent illegible) yeast transformants screened for
trypsin inhibition; pAPR.Sc1 isolated.]
10 20 30 40 50 60

I i i I I I 

GAATTCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAA^^^ 



70 80 90 100 110 120 

I I I I I I 

TAAACGATTAAAAGAATGAAGGCTGTTTTCTTGGTTTTGTCCTTC 

METLysAlaValPheLeuValLeuSerLeuIleGlyPheCysTrp 
spj^ 3 

130 140 150 160 170 180 

I I I I I I 

GCCCAACCATCGAAATTGAAACCAGCTAGCGATATACAAATTCTTTACGACCATGGTC 
UlaGlnFroSerLysLeuLysProAlaSerAspIleGlnlleLeuTyrAspHisGlyVal 

190 200 210 220 230 240 

I I I I i I 

agggagttcggggaaaactatgttcaagagttgatcgataacaccactttggct;^^ . 

ArgGi uPheGlyGl uAsnTyrValGlnGl uLeuIl eAspAsnThrThrLeuAlaAsnVal 

250 260 270 280 290 300 

I I I I I I 

gccatggctgagagattggagaagagaaggcctgatttctgtttggaacctccatac^ 

AlaMetAlaGluArgLeuGluLysArgArgProAspPheCysLeuGluProProiyrThr 

310 320 330 340 350 360 

I I I I I I 

GGTCCATGTAAAGCTAGAATCATCAGATACTTCTACAACGCCAAGGCTGGTTTGTGTCAA 
GlyProCysLysAlaArgllelleArgTyrPheOyrAsnAlaLysAlaGlyLeuCysGln 
»»»»»»»» Aprotinin »»»»»»»»»»»»»»»»> 

370 380 390 400 410 420 

I I I I I I 

ACTTTCGTTTACGGTGGCTGCAGAGCTAAGAGAAACAACTTCAAGTCTGCTGi^^ 
ThrPheValTyrGlyGlyCysArgAlaLysArgAsnAsnPheLysSerAlaGluAspCys 

430 440 450 

I I I 

ATGAGAACTTGTGGTGGTGCCTAATCTAGA 

METAr gThr Cy s G ly G ly Al a 

>»>>»>>>>>>>>>>>>>> 



Fig. 4 




pLaC263 




        10        20        30        40        50        60
         |         |         |         |         |         |
GGAATTCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
         |         |         |         |         |         |
ATAAACGACGGTACCAAAATAATGAAACTGAAAACTGTAAGATCTGCGGTCCTTTCGTCA
                     METLysLeuLysThrValArgSerAlaValLeuSerSer
                     YAP3sp

       130       140       150       160       170       180
         |         |         |         |         |         |
CTCTTTGCATCTCAGGTCCTTGGCCAACCATCGAAATTGAAACCAGCTAGCGATATACAA
LeuPheAlaSerGlnValLeuGlyGlnProSerLysLeuLysProAlaSerAspIleGln

       190       200       210       220       230       240
         |         |         |         |         |         |
ATTCTTTACGACCATGGTGTGAGGGAGTTCGGGGAAAACTATGTTCAAGAGTTGATCGAT
IleLeuTyrAspHisGlyValArgGluPheGlyGluAsnTyrValGlnGluLeuIleAsp

       250       260       270       280       290       300
         |         |         |         |         |         |
AACACCACTTTGGCTAACGTCGCCATGGCTGAGAGATTGGAGAAGAGAAGGCCTGATTTC
AsnThrThrLeuAlaAsnValAlaMetAlaGluArgLeuGluLysArgArgProAspPhe
                                                 »»»»»»

       310       320       330       340       350       360
         |         |         |         |         |         |
TGTTTGGAACCTCCATACACTGGTCCATGTAAAGCTAGAATCATCAGATACTTCTACAAC
CysLeuGluProProTyrThrGlyProCysLysAlaArgIleIleArgTyrPheTyrAsn
»»»»»»»»»»»»»»» Aprotinin »»»»»»»»»»»»»»»»

       370       380       390       400       410       420
         |         |         |         |         |         |
GCCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGTGGCTGCAGAGCTAAGAGAAACAAC
AlaLysAlaGlyLeuCysGlnThrPheValTyrGlyGlyCysArgAlaLysArgAsnAsn
»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»»

       430       440       450       460       470
         |         |         |         |         |
TTCAAGTCTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAATCTAGA
PheLysSerAlaGluAspCysMetArgThrCysGlyGlyAla
»»»»»»»»»»»»»»»»»»»»»»»

Fig. 6
SalI                                                       NheI

GTCGACC ATG ATT TAC ACA ATG AAG AAA GTA CAT GCA CH TGG GCT AGC 49 

Met He Tyr Thr Met Lys Lys Val His Ala Leu Trp Ala Ser 

-28 -25 -20 .15 

SI? J5c fIS f^^ P"^ ^^"^ CCT Cn AAT GCT GAT TCT 97 

Val Cys Leu Leu Leu Asn Leu Ala Pro Ala Pro Leu Asn Ala Asp Ser 
-10 -5 J 

Sad 

Sf!l ATC ACA GAT ACG GAG CTC CCA CCA 145 

Glu Glu Asp Glu Glu His Thr He He Thr Asp Thr Glu Leu Pro Pro 
5 10 15 



193 



fIS S^f I^^ ^ TGT GCA nC AAG GCG GAT GAT GGG^CCC 

Leu Lys Leu Met His Ser Phe Cys Ala Phe Lys Ala Asp Asp Gly Pro 

25 30 

TGT AAA GCA ATC ATG AAA A6A TTT HC TTC AAT AH HC ACT CGA CAG 241 
Cys Lys Ala He Met Lys Arg Phe Phe Phe Asn He Phe Thr Arg Gin 
35 40 45 50 

JSc ^ IP F ^ GGA AAT CAG AAT^CGA TTT 289 

Cys Glu Glu Phe He Tyr Gly Gly Cys Glu Gly Asn Gin Asn Arg Phe 
55 60 65 

GAA AGT CTG GAA GAG TGC AAA AAA ATG TGT ACA AGA GAT AAT GCA AAC 337 
Glu Ser Leu Glu Glu Cys Lys Lys Met Cys Thr Arg Asp Asn Ala Asn 
70 75 80 

AGG ATT ATA AAG ACA ACA CTG CAG CAA GAA AAG CCA GAT TTC TGC TTT 385 
Arg Ile Ile Lys Thr Thr Leu Gln Gln Glu Lys Pro Asp Phe Cys Phe
85 90 95 

BamHI 

TTG GAA GAG GAT CCT GGA ATA TGT CGA GGT TAT ATT ACC AGG TAT TTT 433 
Leu Glu Glu Asp Pro Gly Ile Cys Arg Gly Tyr Ile Thr Arg Tyr Phe
100 105 110 

ΔStuI

J?'^ ri'^ ^ '^^^ '^^'^ '^^^ TAT GGT GGA TGC 481 

Tyr Asn Asn Gln Thr Lys Gln Cys Glu Arg Phe Lys Tyr Gly Gly Cys
120 125 130 

CTG GGC AAT ATG AAC AAT TTT GAG ACA CTC GAG GAA TGC AAG AAC ATT 529
Leu Gly Asn Met Asn Asn Phe Glu Thr Leu Glu Glu Cys Lys Asn Ile
135 140 145 



Fig. 7a 
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KpnI

TGT GAA GAT GGT CCG AAT GGT TTC CAG GTG GAT AAT TAT GGT ACC CAG 577
Cys Glu Asp Gly Pro Asn Gly Phe Gln Val Asp Asn Tyr Gly Thr Gln
150 155 160 

HpaI

CTC AAT GCT GTT AAC AAC TCC CTG ACT CCG CAA TCA ACC AAG GTT CCC 625

Leu Asn Ala Val Asn Asn Ser Leu Thr Pro Gln Ser Thr Lys Val Pro
165 170 175 

EcoRI 

AGC CTT TTT GAA TTC CAC GGT CCC TCA TGG TGT CTC ACT CCA GCA GAT 673 

Ser Leu Phe Glu Phe His Gly Pro Ser Trp Cys Leu Thr Pro Ala Asp 
180 185 190 

ΔEcoRV

AGA GGA TTG TGT CGT GCC AAT GAG AAC AGA TTC TAC TAC AAT TCA GTC 721
Arg Gly Leu Cys Arg Ala Asn Glu Asn Arg Phe Tyr Tyr Asn Ser Val 
195 200 205 210 

BspMII 

ATT GGG AAA TGC CGC CCA TTT AAG TAC TCC GGA TGT GGG GGA AAT GAA 769 
Ile Gly Lys Cys Arg Pro Phe Lys Tyr Ser Gly Cys Gly Gly Asn Glu
215 220 225 

SpeI SphI

AAC AAT TTT ACT AGT AAA CAA GAA TGT CTG AGG GCA TGC AAA AAA GGT 817 

Asn Asn Phe Thr Ser Lys Gln Glu Cys Leu Arg Ala Cys Lys Lys Gly

230 235 240 

StuI 

TTC ATC CAA AGA ATA TCA AAA GGA GGC CTA ATT AAA ACC AAA AGA AAA 865
Phe Ile Gln Arg Ile Ser Lys Gly Gly Leu Ile Lys Thr Lys Arg Lys
245 250 255 

AGA AAG AAG CAG AGA GTG AAA ATA GCA TAT GAA GAA ATT TTT GTT AAA 913
Arg Lys Lys Gln Arg Val Lys Ile Ala Tyr Glu Glu Ile Phe Val Lys
260 265 270 

SalI

AAT ATG TGAGTCGAC 928 

Asn Met 

275 



Fig. 7b 
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EcoRI 

5361 GAATTCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT 

5420 ATAAACGATTAAAAGAATGAAGGCTGTTTTCTTGGTTTTGTCCTTGATCGGATTCTGCT 

MetLysAlaValPheLeuValLeuSerLeuIleGlyPheCys
gpj^3 signal peptide 



PflMI BspEI BclI

5479 GGGCCCAACCAGTCACTGGCGATGAATCATCTGTTGAGATTCCGGAAGAGTCTCTGATC 
TrpAlaGlnProValThrGlyAspGluSerSerValGluIleProGluGluSerLeuIle
♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦212 leader ♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦ 

5538 ATCGCTGAAAACACCACTTTGGCTAACGTCGCCATGGCTAAGAGAGATTCTGAGGAA
IleAlaGluAsnThrThrLeuAlaAsnValAlaMetAlaLysArgAspSerGluGlu--

Kex2 



Fig. 8a 
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536, ^AAllcAAACTAAAAAATGX;G"?ii^CTGTA?!iliTGCG 

MetLysLeuLysThrValArgSerAlaValLeuSerSerLeu
YAP3 signal peptide

AvrII PflMI
leader******-******: 

5538 AGATTCTGAGGAA— 
ArgAspSerGluGlu-- 
♦**<-TFPI -- 



Fig. 8b
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XbaI




[Plasmid construction scheme. Restriction sites shown: XbaI, PflMI, EcoRI, SphI.]

Fragments shown:
5.3 kb XbaI-SphI
6.2 kb PflMI-EcoRI
1.14 kb SphI-XbaI
MHJ1131/MHJ1132 (85 bp EcoRI-PflMI)

Fig. 9
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